A Comment on the Tsallis Maximum Entropy Principle 
Tsallis has suggested a nonextensive generalization of the
Boltzmann-Gibbs entropy, the maximization of which gives a 
generalized canonical distribution under special constraints. 
In this brief report we show that the generalized canonical 
distribution so obtained may differ from that predicted by 
the law of large numbers when empirical samples are held 
to the same constraint. This conclusion is based on a result 
regarding the large deviation property of conditional measures 
and is confirmed by numerical evidence. 

PACS numbers: 02.50.Cw, 05.20.Gg, 05.30.Ch



I. INTRODUCTION 

From considerations of multifractals, Tsallis [1] was led
to conjecture a generalization of the Boltzmann-Gibbs
entropy given by

$$ S_q(p) = \frac{1 - \sum_{i=1}^{m} p_i^q}{q - 1}, \tag{1} $$

where $p = (p_1, \ldots, p_m)$ is a probability distribution for a
discrete random variable with values $\epsilon_1, \ldots, \epsilon_m$ and $q$ is
any real number different from one. $S_1$ is defined to be
the usual Boltzmann-Gibbs entropy, in agreement with
the limit $q \to 1$. (Boltzmann's constant is set to one.)
Non-Gibbsian distributions are obtained by extremizing
the Tsallis entropy under special constraints, described
below, while using $q$ as an adjustable parameter. The
parameter $q$ typically has no direct physical interpretation,
but when it is used as an adjustable parameter the
resulting distributions can give surprisingly good agreement
with experimental data in a wide variety of fields
[2]. In a few cases, $q$ is uniquely determined by the
constraints of the problem and may thereby bear some
physical interpretation [3,4].

Although the Tsallis entropy preserves all of the familiar
thermodynamic formalism, Curado [5] has noted that
this is true of a much broader class of entropies. Given
the myriad of possible entropy functions, one is led to ask
why the Tsallis entropy is special, and a natural place to
look for answers is in the theory of large deviations [6],
which gives a probabilistic justification for the maximum
entropy principle in terms of a unique entropy function.
In this brief report we compare the probabilities obtained
by Tsallis's maximum entropy principle with the asymptotic
frequencies predicted by large deviation theory (i.e.
the law of large numbers) under similar constraints. We
find that the two do not in general agree.



II. TSALLIS MAXIMUM ENTROPY PRINCIPLE 

If no constraints are imposed upon $p$ (other than that
it be nonnegative and normalized), $S_q$ is readily seen to
be extremized by $p_i = 1/m = \mu_i$. (The case $q = 0$ is
special, as $S_q$ is then a constant function.) This conclusion,
independent of $q$, agrees with the usual Boltzmann-Gibbs
result and corresponds to a microcanonical ensemble. If
we view $\mu$ as a sampling distribution, then the empirical
distribution of frequencies obtained from a random sample
$x_1, \ldots, x_n$ converges to $\mu$ almost surely as $n$ grows
large. This well-known result, originally due to Boltzmann
[7], may be viewed as an example of the (strong)
law of large numbers. Since $S_q$ has a global extremum
at $\mu$, the distribution predicted by extremizing $S_q$ agrees
with the actual asymptotic empirical distribution.
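
As a minimal numerical sketch of this observation, the following Python snippet evaluates the Tsallis entropy in its standard form, S_q(p) = (1 - sum_i p_i^q)/(q - 1), at the uniform distribution and at one arbitrary comparison point. The function name `tsallis_entropy` and the comparison distribution are illustrative assumptions, not part of the original analysis.

```python
import math

def tsallis_entropy(p, q):
    """Tsallis entropy S_q(p) = (1 - sum_i p_i^q) / (q - 1), with k = 1.

    For q == 1 the Boltzmann-Gibbs limit -sum_i p_i log p_i is returned.
    Zero-probability states are skipped, which is exact for q > 0.
    """
    support = [x for x in p if x > 0.0]
    if q == 1:
        return -sum(x * math.log(x) for x in support)
    return (1.0 - sum(x ** q for x in support)) / (q - 1.0)

m = 3
uniform = [1.0 / m] * m
skewed = [0.7, 0.2, 0.1]   # arbitrary comparison point (illustrative only)

for q in (0.5, 1, 2):
    # Print S_q at the uniform distribution and at the comparison point.
    print(q, tsallis_entropy(uniform, q), tsallis_entropy(skewed, q))
```

For each of these positive values of q the uniform distribution gives the larger entropy, while for q = 0 the function returns the constant m - 1 for any strictly positive p.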

Placing additional constraints when extremizing $S_q$
may result in a distribution dependent upon $q$, i.e. one at
variance with that predicted from the Boltzmann-Gibbs
case $q = 1$. As a generalization of the internal energy
constraint, Tsallis [8] has suggested the following constraint
be used when extremizing $S_q$:

$$ \frac{\sum_{i=1}^{m} p_i^q \epsilon_i}{\sum_{i=1}^{m} p_i^q} = u, \tag{2} $$

where $u$ is a given fixed constant. For $q = 1$ this of course
reduces to the usual expectation value constraint. By
extremizing (1) subject to (2), one obtains a solution in
general different from the Boltzmann distribution. This
solution is given explicitly by



$$ p_i \propto \left[ 1 - (1 - q)\,\alpha\,(\epsilon_i - u) \right]^{1/(1-q)}, \tag{3} $$

where $\alpha$ is chosen such that Eqn. (2) is satisfied. It has
been noted that this explicit form of the distribution
appears to be more numerically robust than the more
common implicit form, for which $\alpha = \beta / \sum_{j=1}^{m} p_j^q$ [?].
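
For q = 1/2 the constraint becomes linear in α, so the explicit form can be evaluated in exact arithmetic. The sketch below assumes that the constraint is the normalized q-expectation, sum_i p_i^q (ε_i - u) = 0, and that the exponent in the explicit solution is 1/(1 - q). The function name `tsallis_half` is hypothetical, and the inputs ε = (0, 1, 2), u = 7/11 are those quoted in the caption of Fig. 1.

```python
from fractions import Fraction

def tsallis_half(eps, u):
    """Tsallis distribution for q = 1/2 under the normalized q-expectation.

    Assumes p_i is proportional to [1 - (1 - q)*alpha*(eps_i - u)]**(1/(1 - q)).
    For q = 1/2 this gives p_i ~ w_i**2 and p_i**q ~ w_i, with
    w_i = 1 - (alpha/2)*(eps_i - u), so the constraint
    sum_i p_i**q * (eps_i - u) = 0 is linear in alpha.
    """
    d = [Fraction(e) - u for e in eps]
    alpha = 2 * sum(d) / sum(x * x for x in d)
    w = [1 - alpha / 2 * x for x in d]
    if min(w) < 0:
        # The squared form would no longer represent a valid solution.
        raise ValueError("negative base: u is outside the range handled here")
    norm = sum(x * x for x in w)
    return alpha, [x * x / norm for x in w]

alpha, p = tsallis_half([0, 1, 2], Fraction(7, 11))
print(alpha, p)
```

With these inputs the result reproduces the distribution p = (289, 121, 25)/435 quoted in the caption of Fig. 1.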

For $q = 1$ the constraint on the expectation may be
interpreted as a constraint on the sample mean, the two
being equivalent for large samples. Thus, if we consider
random samples $x_1, \ldots, x_n$ from $\mu$ which satisfy

$$ \frac{1}{n} \sum_{k=1}^{n} x_k = u, \tag{4} $$

then the empirical distributions of such samples will
approach the Boltzmann distribution $p_i \propto e^{-\beta \epsilon_i}$ as $n$ grows
large.
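
In this q = 1 case the parameter β of the Boltzmann distribution is fixed by the mean constraint. A minimal sketch of that step, assuming plain bisection and illustrative values ε = (0, 1, 2), u = 7/11 (the function names are hypothetical):

```python
import math

def canonical(eps, beta):
    """Boltzmann weights p_i = exp(-beta * eps_i) / Z."""
    w = [math.exp(-beta * e) for e in eps]
    z = sum(w)
    return [x / z for x in w]

def mean_energy(eps, beta):
    """Expectation value sum_i eps_i * p_i under the canonical distribution."""
    return sum(e * p for e, p in zip(eps, canonical(eps, beta)))

def solve_beta(eps, u, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection for beta such that mean_energy(eps, beta) = u.

    The mean energy is strictly decreasing in beta (its derivative is minus
    the energy variance), so a single sign change on [lo, hi] brackets the root.
    """
    if not (mean_energy(eps, hi) < u < mean_energy(eps, lo)):
        raise ValueError("u is not attainable on the bracket [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_energy(eps, mid) > u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

eps = [0.0, 1.0, 2.0]   # illustrative energy levels
u = 7.0 / 11.0          # illustrative target mean
beta = solve_beta(eps, u)
print(beta, canonical(eps, beta))
```

Because u lies below the unconstrained mean of 1, the solution has β > 0, i.e. the lower levels are favored.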

The question arises whether a similar interpretation
may be made of the constraint in Eqn. (2) for $q \neq 1$
and, more importantly, whether the resulting empirical
distribution converges to that given by Eqn. (3). As our
observable is discrete, let $f_{n,i}(x_1, \ldots, x_n)$ denote the
observed frequency of $\epsilon_i$ in the sample $x_1, \ldots, x_n$. (There
is no obvious interpretation for continuous values.) We
may interpret Eqn. (2) to mean

$$ \frac{\sum_{i=1}^{m} \epsilon_i\, f_{n,i}^{\,q}(x_1, \ldots, x_n)}{\sum_{i=1}^{m} f_{n,i}^{\,q}(x_1, \ldots, x_n)} = u. \tag{5} $$

We will show that random samples drawn from $\mu$ which
satisfy Eqn. (5) do not in general give rise to empirical
distributions which converge to the Tsallis prediction of
Eqn. (3).



III. CONDITIONAL CONVERGENCE OF THE
EMPIRICAL DISTRIBUTION

The general problem we are considering is the convergence
in probability of the empirical frequencies $f_n =
(f_{n,1}, \ldots, f_{n,m})$, where $f_n$ is a random vector with domain
$\{\epsilon_1, \ldots, \epsilon_m\}^n$ taking values in the convex set $\mathcal{P} =
\{p \in \mathbb{R}^m : p_i \ge 0, \sum_{i=1}^{m} p_i = 1\}$. Unconstrained, an
infinite random sample $x_1, x_2, \ldots$ from $\mu$ gives rise to a
sequence of empirical frequencies which converge in probability
to $\mu$. Sanov's theorem [?] gives the large deviation
rate function for this convergence to be, up to an additive
constant, just the negative of the Boltzmann-Gibbs entropy:

$$ I_\mu(p) = -S_1(p) + \log m. \tag{6} $$

Loosely speaking, Sanov's theorem states that for $A \subset \mathcal{P}$,
$\mu^n[f_n \in A] \approx \exp[-n \min_{p \in A} I_\mu(p)]$ for large $n$ (cf. the
Boltzmann-Einstein formula $W = e^{S}$). The asymptotic
measure, $\mu$, is the unique minimum of the rate function
$I_\mu$, which is continuous and strictly convex.

When we impose additional constraints on $f_n$, the
asymptotic value changes from $\mu$ to a new distribution
which minimizes $I_\mu$ under the added restrictions [?,10].
If we condition on the sample mean, for example, i.e.

$$ \sum_{i=1}^{m} \epsilon_i\, f_{n,i}(x_1, \ldots, x_n) = u, \tag{7} $$

the resulting asymptotic distribution is no longer $\mu$ but
the canonical distribution $P_i \propto e^{-\beta \epsilon_i}$, where $\beta$ satisfies

$$ \sum_{i=1}^{m} \epsilon_i P_i = u. \tag{8} $$

It is in this sense that finding the asymptotic empirical
distribution under (7) is equivalent to maximizing $S_1$
under (8).

More generally, imposing condition (5) results in an
asymptotic distribution which minimizes $I_\mu$ (maximizes
$S_1$) subject to (2). This distribution is given implicitly
by

$$ P_i \propto \exp\left[ -\beta (\epsilon_i - u) P_i^{q-1} \right], \tag{9} $$

where $\beta$ is such that Eqn. (2) is satisfied with $p$ replaced
by $P$. Comparison with Eqn. (3) shows that both $p$ and
$P$ will agree when $q \to 1$.


IV. COMPARISON OF THE TWO
DISTRIBUTIONS 

For $q = 0$, Eqn. (3) gives $p_i = [1 - \alpha(\epsilon_i - u)]/m$, with
$\alpha$ unrestricted, while Eqn. (9) implies $P_i = 1/m$. Clearly
both agree if $\alpha$ is arbitrarily chosen to be zero. However,
as we have noted, $S_q$ is a constant function for $q = 0$, so the
entropy extremization procedure may be expected to break down
in this case.

Taking $u$ to be the equilibrium value $u_\mu = \sum_{i=1}^{m} \epsilon_i \mu_i$
also results in general agreement between $p$ and $P$ for
all $q \neq 0$. Indeed, by choosing $\alpha = \beta = 0$ we see that
$p_i = P_i = 1/m$ is the unique solution for both Eqn. (3) and
Eqn. (9). This agreement simply reflects the fact that
both $S_1$ and $S_q$ have the same global extremum.

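When $m = 2$ the two constraints are sufficient to
uniquely determine the distribution, and for this reason
general agreement is also expected. In particular we find

$$ p_1 = P_1 = \frac{(\epsilon_2 - u)^{1/q}}{(u - \epsilon_1)^{1/q} + (\epsilon_2 - u)^{1/q}}, \qquad p_2 = P_2 = \frac{(u - \epsilon_1)^{1/q}}{(u - \epsilon_1)^{1/q} + (\epsilon_2 - u)^{1/q}}, \tag{10} $$

assuming $\epsilon_1 < \epsilon_2$ and $q \neq 0$. It is readily verified that
Eqn. (2) is satisfied. By solving for $\alpha$ and $\beta$, Eqns. (3)
and (9), respectively, may be satisfied as well.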

Disagreement between $p$ and $P$ is therefore expected
when $m \ge 3$. To show this explicitly, we may compute
$p$ from Eqn. (3) for an arbitrary $u$ and then search for
a value of $\beta$ such that Eqn. (9) is satisfied when $p$ is
substituted for $P$. The claim is that a single $\beta$ cannot
always be found which satisfies this equation for all values
of $i$ when $m \ge 3$.

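The case $q = 1/2$ is particularly amenable to analytic
study [11] and appears in an early application of the Tsallis
entropy to turbulence in a two-dimensional electron
plasma [12]. For this case, Eqn. (3) may be solved explicitly
in terms of $u$ to obtain

$$ p_i = \frac{\left[ 1 - \tfrac{\alpha}{2} (\epsilon_i - u) \right]^2}{\sum_{j=1}^{m} \left[ 1 - \tfrac{\alpha}{2} (\epsilon_j - u) \right]^2}, \qquad \alpha = \frac{2 \sum_{j=1}^{m} (\epsilon_j - u)}{\sum_{j=1}^{m} (\epsilon_j - u)^2}. \tag{11} $$

Using a given value of $u$ and the corresponding $p$ given
above, we then consider zeros of the functions $d_i$, where

$$ d_i(\beta) = P_i(\beta) - p_i, \qquad P_i(\beta) = \frac{\exp\left[ -\beta (\epsilon_i - u) p_i^{q-1} \right]}{\sum_{j=1}^{m} \exp\left[ -\beta (\epsilon_j - u) p_j^{q-1} \right]}, \tag{12} $$

for $i = 1, \ldots, m$. A plot of these functions is shown in Fig.
1 for selected parameter values. The failure of all three
graphs to have a zero at the same value of $\beta$ indicates
that $p$ and $P$ are in this case distinct.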
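
This comparison can be sketched numerically as follows. The code assumes that d_i(β) has the form exp(-β a_i) / sum_j exp(-β a_j) - p_i with a_i = (ε_i - u) p_i^(q-1), uses the parameter values quoted in the caption of Fig. 1, and locates the positive root of each d_i by plain bisection. The bracket [1e-6, 5] and the helper names are assumptions made for illustration.

```python
import math

eps = [0.0, 1.0, 2.0]
q = 0.5
u = 7.0 / 11.0
p = [289.0 / 435.0, 121.0 / 435.0, 25.0 / 435.0]   # values quoted in the Fig. 1 caption

# a_i = (eps_i - u) * p_i**(q - 1): coefficient of -beta in the i-th exponent
a = [(e - u) * pi ** (q - 1.0) for e, pi in zip(eps, p)]

def d(i, beta):
    """d_i(beta) = exp(-beta*a_i) / sum_j exp(-beta*a_j) - p_i."""
    w = [math.exp(-beta * x) for x in a]
    return w[i] / sum(w) - p[i]

def bisect(f, lo, hi, tol=1e-10):
    """Plain bisection; requires a sign change of f on [lo, hi]."""
    flo = f(lo)
    if flo * f(hi) > 0:
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if flo * fmid <= 0:
            hi = mid                 # root lies in [lo, mid]
        else:
            lo, flo = mid, fmid      # root lies in [mid, hi]
    return 0.5 * (lo + hi)

# The bracket is an assumption that isolates the positive root in this example.
roots = [bisect(lambda b, i=i: d(i, b), 1e-6, 5.0) for i in range(3)]
print(roots)
```

The three roots come out close to the values quoted in the caption of Fig. 1 and differ from one another, so no single β satisfies all three equations at once.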

From this example one can derive a general necessary
condition for agreement with $P$. Suppose that for given
$q$, $\epsilon$, and $u$ there exists a simultaneous solution to both
Eqns. (3) and (9). (More generally, $p$ may be any probability
distribution satisfying Eqn. (2).) Substituting the
former into the latter we find

$$ p_i = \exp\left[ -\beta (\epsilon_i - u) p_i^{q-1} \right] / Z(\beta), \tag{13} $$

where

$$ Z(\beta) = \sum_{j=1}^{m} \exp\left[ -\beta (\epsilon_j - u) p_j^{q-1} \right]. \tag{14} $$

The value of each $p_i$ is fixed in terms of the given parameters,
so a single value of $\beta$ must simultaneously satisfy
Eqn. (13) for $i = 1, \ldots, m$. If any $p_i = 0$ then Eqn.
(13) cannot possibly be satisfied, so suppose all $p_i$ are
nonzero. For any given $j \neq i$,

$$ \beta = -\frac{\log p_j + \log Z(\beta)}{(\epsilon_j - u) p_j^{q-1}}. \tag{15} $$

Substituting this expression back into Eqn. (13) gives

$$ \log Z(\beta) = \frac{(\epsilon_i - u) p_i^{q-1} \log p_j - (\epsilon_j - u) p_j^{q-1} \log p_i}{(\epsilon_j - u) p_j^{q-1} - (\epsilon_i - u) p_i^{q-1}}. \tag{16} $$

The RHS of Eqn. (16) is invariant under the interchange
of $i$ and $j$, so it has at most $m(m - 1)/2$ distinct
values. The LHS, of course, is the same for all choices of
$i$ and $j$. Now, the RHS will be independent of the choice
of $i$ and $j$ if either (1) $q = 1$, (2) $m = 2$, or (3) $p_i = p_j$
for all $i$ and $j$, the latter being equivalent to $u = u_\mu$,
which is equivalent to $\alpha = 0$. Assuming none of these
three conditions hold, the RHS must be the same for all
choices of $i$ and $j$ if indeed $p = P$. This gives a necessary
condition for agreement.

FIG. 1. Plot of $d_i(\beta) = P_i(\beta) - p_i$ for $\epsilon = (0, 1, 2)$, $q = 1/2$,
and $u = 7/11$, for which $p = (289, 121, 25)/435$. The positive
roots are found numerically to be 0.514509, 0.637715,
0.360903 for $i = 1, 2, 3$, respectively.



V. DISCUSSION

We have compared the probability distribution over $m$
states predicted from Tsallis's maximum entropy principle,
which constrains the normalized $q$-expectation to
a value $u$, to the asymptotic frequencies when the empirical
$q$-expectation is similarly constrained. The two
will always agree if either (1) $q = 1$, (2) $m = 2$, or (3)
$u = u_\mu$. A specific example for which $q = 1/2$ and $m = 3$
was used to demonstrate numerically that the two distributions
may be different. For the case in which none
of these three conditions hold, we derived a necessary
condition to be satisfied by any candidate distribution in
order that it be identical to the true asymptotic distribution.

From the point of view of large deviation theory, the
maximum entropy principle specifies the overwhelmingly
most probable distribution to be realized by a large-sample
empirical distribution under given constraints.
The uniqueness of the rate function in large deviation
theory implies that the Boltzmann-Gibbs entropy plays
a special role in determining this most likely distribution.
For this reason, novel entropy functions such as that proposed
by Tsallis may give results which are at variance
with actual sample frequencies except, as observed, in
some special cases.
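
The pairwise necessary condition of Sec. IV can also be checked directly. The sketch below assumes the condition takes the form log Z = (a_i log p_j - a_j log p_i) / (a_j - a_i) with a_i = (ε_i - u) p_i^(q-1), and evaluates the right-hand side for every pair of states in the q = 1/2, m = 3 example. The helper name `pairwise_log_z` is hypothetical.

```python
import math
from itertools import combinations

def pairwise_log_z(eps, p, q, u):
    """Right-hand side of the necessary condition for every pair i < j.

    With a_i = (eps_i - u) * p_i**(q - 1), the pair (i, j) yields
    (a_i*log p_j - a_j*log p_i) / (a_j - a_i).  Agreement between the two
    distributions requires all of these to equal one number, log Z(beta).
    Requires a_i != a_j for every pair.
    """
    a = [(e - u) * pi ** (q - 1.0) for e, pi in zip(eps, p)]
    lp = [math.log(pi) for pi in p]
    return {
        (i, j): (a[i] * lp[j] - a[j] * lp[i]) / (a[j] - a[i])
        for i, j in combinations(range(len(p)), 2)
    }

eps = [0.0, 1.0, 2.0]
p = [289.0 / 435.0, 121.0 / 435.0, 25.0 / 435.0]   # example of Fig. 1
values = pairwise_log_z(eps, p, q=0.5, u=7.0 / 11.0)
for pair, v in sorted(values.items()):
    print(pair, v)
spread = max(values.values()) - min(values.values())
print("spread:", spread)
```

For this example the three pairwise values differ, so no common value of log Z(β) exists, in line with the behavior shown in Fig. 1. For q = 1 and a canonical p the same expression is independent of the pair, as expected.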